Estimating the Fraction of Non-Coding RNAs in Mammalian Transcriptomes

Authors

  • Yurong Xin
  • Giulio Quarta
  • Hin Hark Gan
  • Tamar Schlick
Abstract

Recent studies of mammalian transcriptomes have identified numerous RNA transcripts that do not code for proteins; their identity, however, is largely unknown. Here we explore an approach based on sequence randomness patterns to discern different RNA classes. The relative z-score we use helps distinguish the known ncRNA class from the genome, intergene and intron classes. This leads us to a fractional ncRNA measure for putative ncRNA datasets, which we model as a mixture of genuine ncRNAs and other transcripts derived from genomic, intergenic and intronic sequences. We use this model to analyze six representative datasets identified by the FANTOM3 project and two computational approaches based on comparative analysis (RNAz and EvoFold). Our analysis suggests fewer ncRNAs than estimated by DNA sequencing and comparative analysis, but confirming the validity of our approach and its predictions requires more extensive experimental RNA data.
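The mixture idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a dataset's mean score is a linear blend of a genuine-ncRNA reference mean and a background reference mean; the abstract does not give the paper's actual relative z-score definition or fitting procedure, so the function name, parameters and numbers below are hypothetical.

```python
def estimate_ncrna_fraction(observed_mean, ncrna_mean, background_mean):
    """Estimate the ncRNA fraction f under an assumed two-component mixture.

    Assumption (not taken from the paper):
        observed_mean = f * ncrna_mean + (1 - f) * background_mean
    The result is clipped to the interval [0, 1].
    """
    spread = ncrna_mean - background_mean
    if spread == 0:
        # With identical reference means the fraction cannot be identified.
        raise ValueError("reference classes must have different mean scores")
    fraction = (observed_mean - background_mean) / spread
    return min(1.0, max(0.0, fraction))


# Invented numbers, for illustration only (not data from the study).
example_fraction = estimate_ncrna_fraction(
    observed_mean=0.4, ncrna_mean=1.0, background_mean=0.0
)
```

A single background mean stands in here for the genomic, intergenic and intronic classes named in the abstract; a fuller model would treat each as its own component.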


Similar resources

Long non-coding RNAs and their significance in human diseases

Protein-coding genes account for only a small fraction of the human genome and most of the genomic sequences are transcriptionally silent, but recent observations indicate significant functional elements, including non-protein-coding transcripts, in the human genome. Long non-coding RNAs (lncRNAs) have been defined as transcripts of >200 nucleotides without protein-coding capacity that perform t...


P87: The Role of the Long Non-Coding RNA Sequences (LncRNAs) in Neurological Disorders

Precise interpretation of transcriptome sequences in several species has shown that the major part of the genome is transcribed; however, only a small number of the transcribed sequences have open reading frames that are conserved during evolution. So, it is unlikely that many of the transcribed sequences code for proteins. Among all human non-coding transcripts, at least 10000...


The Roles of Long non-coding RNAs (lncRNA) in Prostate Cancer

Background & Objective: Prostate cancer is a complex condition in which gene expression is altered. Several studies have revealed that genetic components are involved in prostate cancer progression. Findings proposed that they can modify a noteworthy portion of disposing of elements, which is associated with developing prostate cancer in protein coding sequences. The purpose of this r...


The Role of Long Non Coding RNAs in the Repair of DNA Double Strand Breaks

DNA double strand breaks (DSBs) are lesions in both strands of the DNA duplex that arise following exposure to both exogenous and endogenous conditions. Such lesions have deleterious effects in cells, leading to genome rearrangements and cell death. A number of repair systems including homologous recombination (HR) and non-homologous end-joining (NHEJ) have evolved to minimize the fatal effe...


The Long Non-Coding RNAs: A New (P)layer in the “Dark Matter”

The transcriptome of a cell is represented by a myriad of different RNA molecules with and without protein-coding capacities. In recent years, advances in sequencing technologies have allowed researchers to more fully appreciate the complexity of whole transcriptomes, showing that the vast majority of the genome is transcribed, producing a diverse population of non-protein coding RNAs (ncRNAs)....



Journal title:

Volume 2, Issue

Pages -

Publication date: 2008